1. INTRODUCTION

A genetic disease is a disorder caused by changes in a person's genes or chromosomes. Genetic diseases
include Down syndrome, Noonan syndrome, Turner syndrome, Williams-Beuren syndrome, Cornelia de Lange
syndrome, Klinefelter syndrome, and coronary artery disease (CAD). According to data from the World
Health Organization (WHO), the estimated incidence of Down syndrome is about 1 in 1,000 live births,
with 3,000-5,000 children born with the condition each year [1], [2]. Turner syndrome is estimated to
affect about 1 in 2,500 live female births worldwide [3]. If it goes undetected, a genetic disease can be
dangerous for the affected individual. In young children, genetic diseases can stunt growth, cause
cognitive impairment, delay mental development, lead to neurodevelopmental disabilities, cause
differences in facial characteristics and body defects, and even cause death [4].

Diagnosing a genetic disease usually requires a genetic examination, commonly known as genetic
testing. The test results make it possible to identify changes or mutations in one or more chromosomes,
genes, or proteins and so determine the likelihood of a genetic disease [5]. A suspected disease must be
confirmed by complex genetic testing, which is available only in large hospitals. The examination
usually takes 2-3 weeks [6], but in some cases it can take up to 8 weeks, and the costs are significant,
so it is not accessible to everyone [7]. In addition, doctors sometimes perform a physical examination
of specific characteristics such as facial features; at an early stage, these examinations are mostly
done by geneticists.

The doctor examines the facial dysmorphology and makes a diagnosis based on it. Previous studies
have explained that the faces of people with a given genetic disease share general characteristics. In
people with Down syndrome, the typical features are small eyes, a round face, a small chin, Brushfield
spots on the iris, a flat nasal bridge, and abnormal or folded outer ears [8].

Meanwhile, Noonan syndrome has facial characteristics that change with the patient's age. During
childhood, the characteristic features are a high forehead, short nose, slanted philtrum, hypertelorism,
and thick hooded eyelids [9]. In later life, there is a deep groove between the nose and mouth, widely
spaced eyes, low-set ears, a small lower jaw, possibly crooked teeth, an arched palate, and thin,
transparent skin on the neck. The face may also appear expressionless [10].

In patients with Cornelia de Lange syndrome, the characteristic facial features are thick eyebrows, a
short nose, long philtrum, upturned nasal tip, concave nasal ridge, synophrys, thin upper lip, small
widely spaced teeth, crescent-shaped mouth, and short neck [11], [12]. Turner syndrome is characterized
by slanted eyes with skin folds at the corners of the eyes, ptosis (drooping eyelids), and a small lower
jaw. A short neck is also a feature of Turner syndrome because of associated bone abnormalities [13].
Meanwhile, Williams-Beuren syndrome (WBS) has classic facial characteristics such as spiked hair, wide
eyebrows, a wide forehead, short nose, flat nasal bridge, long philtrum, and wide mouth [14]. Because
several of these features recur across syndromes, practitioners sometimes find it difficult to
distinguish physical characteristics whose differences are too subtle to see with the naked eye [15].

Face recognition is one of the biometric technologies currently being developed to identify a person,
because every individual, even a twin, has a unique facial structure [16]. Face recognition has been
widely applied in fields such as health, security and law, education, and human-computer interaction
[17]. When integrated with artificial intelligence, the information obtained from a person's face can
greatly aid human work. However, identifying faces by computer is not easy, because human facial
features vary and can be either static or dynamic [18]. With the help of machine learning, a computer
can distinguish the facial dysmorphology associated with genetic diseases [8].

In some cases, early diagnosis of a genetic disease is critical in determining the next steps, which
should be accompanied by a complete examination. Computer-assisted diagnosis using artificial
intelligence can therefore help diagnose genetic diseases accurately. Such systems can increase the
efficiency of early diagnosis and provide valuable information to doctors and patients [7].

Several previous studies have developed methods for detecting genetic diseases using facial
recognition [7], [19], [20]. In 2021, Liu et al. studied the faces of patients with WBS using a
convolutional neural network (CNN) and reported a promising accuracy of 92.7% [7]. A 2012 study by a
research group from Turkey on Down syndrome patients reached a maximum accuracy of 97.34% using the
Gabor wavelet transform and a support vector machine (SVM) [19]. Meanwhile, Hong et al., also using a
CNN, identified several genetic diseases with an accuracy of 88.6% from 456 facial data samples of
children [20]. These results show that identifying genetic diseases from facial dysmorphology is
feasible and can be developed further. This review article aims to provide an overview, to identify
opportunities for further research on applying artificial intelligence methods to the identification of
complex genetic diseases, and to find out which artificial intelligence methods perform best in
identifying genetic diseases.


2. DEVELOPMENT OF FACE RECOGNITION 

Face recognition has been under development since the 1950s and 1960s. In 1964, Bledsoe and Wilson
carried out research on computer programming for semi-automatic facial recognition based on the mouth
and eyes [21]. This was followed in 1970 by Takeo Kanade's study on facial matching systems using facial
anatomical features and, seven years later, by the publication of a book on facial recognition
technology [22].

The first successful facial recognition technique, called "Eigenfaces", was introduced by Turk and
Pentland of the Massachusetts Institute of Technology in 1991 and is based on a statistical method,
principal component analysis (PCA). In this study, Turk and Pentland developed a linear model by
combining factor analysis and the Karhunen-Loève theorem [23]. This led to the further development of
"Fisherfaces", which improved on the approach by using linear discriminant analysis (LDA) [24]. In 1998,
the Defense Advanced Research Projects Agency (DARPA) developed the Face Recognition Technology
(FERET) program to assist intelligence and security work [25]. The field continued to grow until, in
2011, facial recognition using an artificial neural network was developed, which led to DeepFace,
Facebook's internal algorithm, in 2014 [15].

Research on automatic facial recognition using artificial intelligence has continued to grow since
then. Face recognition can solve identification and verification problems: a person's face is compared
against stored data to determine the person's identity. Face recognition can be image-based or
video-based [17]; this review focuses on image-based methods. Likewise, face recognition has been widely
used in the early screening or early diagnosis of diseases that can be distinguished by facial features.
This review covers 13 studies from the last ten years in which facial recognition is used to detect
genetic diseases.


3. FACE RECOGNITION ALGORITHM 

In general, face recognition consists of three main steps, shown in Figure 1. The first is face
detection, in which the system determines whether the image contains a face. The second is feature
extraction, which identifies the distinctive features or characteristics of the face. The last is face
recognition, which consists of identification and verification by comparing the extracted features with
existing data. This process can be carried out with previously developed algorithms [26].


[Figure 1: flowchart. Input -> Face Detection -> Feature Extraction -> Face Recognition]

Figure 1. Flowchart of the AI-based models and experimental methods applied
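
The three steps can be illustrated with a minimal sketch. It is not the method of any study reviewed here: the function names, the "non-uniform image" detector, and the intensity-histogram features are hypothetical placeholders chosen only to show how detection, feature extraction, and recognition connect.

```python
from typing import Dict, List, Optional

Image = List[List[int]]      # grayscale image as rows of 0-255 pixel values
Features = List[float]


def detect_face(image: Image) -> Optional[Image]:
    """Placeholder detector (assumption): any non-uniform image counts as a face.

    A real system would return the cropped face region found by a detector.
    """
    pixels = [p for row in image for p in row]
    if not pixels or max(pixels) == min(pixels):
        return None
    return image


def extract_features(face: Image, bins: int = 4) -> Features:
    """Placeholder extractor (assumption): normalised intensity histogram."""
    pixels = [p for row in face for p in row]
    hist = [0.0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1.0
    return [h / len(pixels) for h in hist]


def recognize(features: Features, gallery: Dict[str, Features]) -> str:
    """Identification: return the label of the closest stored feature vector."""
    def squared_distance(a: Features, b: Features) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda label: squared_distance(features, gallery[label]))


def run_pipeline(image: Image, gallery: Dict[str, Features]) -> Optional[str]:
    """Chain the three steps; return None when no face is detected."""
    face = detect_face(image)
    if face is None:
        return None
    return recognize(extract_features(face), gallery)
```

In practice each placeholder would be replaced by one of the methods discussed in the following subsections, while the overall structure stays the same.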


3.1. Gabor wavelet transform-based method 

The Gabor wavelet transform (GWT) is often used to suppress noise in images [27]. Saraydemir et al.
[19] conducted a study with data taken from one of the universities in Turkey and the Down syndrome
association. The images were first pre-processed, and GWT was then used to extract their features. GWT
was chosen because its robustness against local distortion had previously been shown to suit face
recognition [28]. Gabor wavelets have also been found to produce distortion-tolerant feature spaces for
other pattern recognition tasks, including texture analysis. The classification in this study used
k-nearest neighbours (k-NN) and SVM, which have been used in several other studies because they classify
such patterns well. After the dimensions of all images had been adjusted, the classification accuracy
reached 96% with k-NN and 97.34% with SVM [19].
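
As a rough illustration of Gabor-based feature extraction, the sketch below builds the real part of a 2D Gabor kernel at several orientations and uses the mean absolute filter response per orientation as a feature vector. The kernel size, sigma, wavelength, and aspect ratio are illustrative assumptions, not the settings used in [19], and the resulting features would still need to be passed to a classifier such as k-NN or SVM.

```python
import math


def gabor_kernel(size, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor kernel (size must be odd).

    Gaussian envelope multiplied by a cosine carrier oriented at angle theta.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(image, kernel):
    """Slide the kernel over the image ('valid' region only, no padding)."""
    size = len(kernel)
    out = []
    for r in range(len(image) - size + 1):
        row = []
        for c in range(len(image[0]) - size + 1):
            acc = 0.0
            for i in range(size):
                for j in range(size):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out


def gabor_features(image, thetas, size=9, sigma=2.0, lambd=4.0):
    """One feature per orientation: mean absolute response (illustrative choice)."""
    features = []
    for theta in thetas:
        response = filter_response(image, gabor_kernel(size, sigma, theta, lambd))
        values = [abs(v) for row in response for v in row]
        features.append(sum(values) / len(values))
    return features
```

For example, an image of vertical stripes whose period matches the wavelength gives a much larger feature at orientation 0 than at orientation pi/2, which is the kind of orientation-selective texture information a classifier can then use.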

In contrast, Zhao et al. in 2013 compared combined geometric and Gabor jet features with combined
geometric and LBP-based texture features on individuals with Down syndrome from a variety of ethnic
backgrounds. When the combined features were classified with SVM, both gave outstanding results, with an
accuracy of up to 0.970. According to the researchers, this is a simple, affordable, instant, and
accurate solution for doctors [29].

In the same year, Kosilek et al. studied female outpatients with Cushing's syndrome at Mannheim
University Hospital. The data consisted of frontal and side-view photographs taken with a camera. Each
image was labeled to simplify the later classification. The classification compared the geometry and
texture of the images, with Gabor jets used to describe the texture. This analysis achieved 91.7%
accuracy with 92% specificity and 96% sensitivity. The strength of this study is its simplicity: it
requires only two photographs of each patient [30].
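
Because the studies in this review report accuracy, sensitivity, specificity, and in some cases the F1 score, a short sketch of how these follow from a binary confusion matrix may help. The counts in the usage note are hypothetical and are not taken from any of the cited studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: patients classified correctly/incorrectly,
    tn/fp: controls classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,          # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),          # share of patients detected
        "specificity": tn / (tn + fp),          # share of controls correctly rejected
        "f1": 2 * tp / (2 * tp + fp + fn),      # harmonic mean of precision and sensitivity
    }
```

With hypothetical counts of 45 true positives, 10 false positives, 40 true negatives, and 5 false negatives, this gives an accuracy of 0.85, a sensitivity of 0.90, and a specificity of 0.80. Accuracy is a weighted average of sensitivity and specificity, so it depends on the ratio of patients to controls in the dataset.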


Face recognition in identifying genetic diseases: a progress review (Salsabila Aurellia) 


ISSN: 2252-8938


3.2. Texture analysis method 

Several studies use semi-automatic methods for feature learning [31]-[33]. In 2016, Basel-Vanagaite
et al. studied people with Cornelia de Lange syndrome, using Bayesian networks to evaluate various local
features, such as ratios of distances on the face, in order to identify dysmorphic features and to
evaluate the similarity to the trained syndrome. Local binary patterns (LBP) were used to capture the
appearance of the entire face. The resulting system classified 87% of cases correctly, an improvement
over previous research, with a sensitivity of 86% and a specificity of 89%. The system made errors in
only three mild cases, and its detection is consistent and continues to improve as it learns [31].
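
The basic LBP operator mentioned above can be sketched as follows. This is the plain 3x3 variant with a 256-bin histogram, shown only to make the idea concrete; the neighbour ordering is an arbitrary convention, and the exact LBP configuration used in [31] is not specified in this review.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel (a convention
# chosen for this sketch; other orderings only permute the codes).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at (r, c): bit k is set when the k-th
    neighbour is at least as bright as the centre pixel."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The histogram, often computed per face region and concatenated, serves as the texture descriptor that is passed to a classifier.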

One year later, Hadj-Rabia et al. conducted a study to detect X-linked hypohidrotic ectodermal
dysplasia (XLHED) patients, including neonates and carriers of the XLHED gene, using Facial
Dysmorphology Novel Analysis (FDNA) software. FDNA has been widely used for various genetic diseases.
The study found that this method can distinguish male and female XLHED patients with reasonable
accuracy, with a sensitivity of 75% over all data and a specificity of 99% [32].

Similarly, Liehr et al. in the same year also used FDNA for a different application, namely patients
with Emanuel syndrome (ES) and Pallister-Killian syndrome (PKS). The results show that the approach can
shorten the time to diagnosis and differentiates ES from PKS well. The study also showed that the
solution is inexpensive, can be used in places without access to more sophisticated genetic approaches,
and reaches a best accuracy of 92.8% [33].


3.3. Deep learning-based method 

Deep learning is widely used in image analysis and is currently the focus of many researchers. In
deep learning, several interconnected hidden layers let the model learn to produce the output, as shown
in Figure 2. The CNN is one type of deep learning algorithm and has been widely applied in artificial
intelligence for medical imaging [34]. Various CNN architectures successfully identify diseases from
medical images thanks to their excellent recognition rates [35]. As a result, in the last five years
many studies have focused on developing CNNs to detect genetic diseases through facial recognition [7],
[20], [36]-[40].


[Figure 2: network diagram. Inputs -> Input Layer -> Hidden Layer 1 -> ... -> Hidden Layer n -> Output]

Figure 2. Deep neural network
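
The layered structure in Figure 2 can be sketched as a forward pass through fully connected layers. This is a generic illustration with ReLU hidden activations and a softmax output (both assumptions made for the sketch), not a CNN and not the architecture of any study reviewed here.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Hidden-layer activation: negative values are set to zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn output scores into class probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]   # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def forward(inputs, layers):
    """layers is a list of (weights, biases); the last entry is the output layer."""
    activation = inputs
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activation, weights, biases))
```

Training consists of adjusting the weights and biases so that the output probabilities match the labels; a CNN follows the same principle but replaces the early dense layers with convolutional ones.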


Research has also moved toward more complex problems: Singh and Kisku in 2018 [36] and Gurovich
et al. in 2019 [37] identified many genetic diseases at once, with good results. Singh and Kisku [36]
used CNNs with the Visual Geometry Group face model (VGGFace) and residual network (ResNet) 50
architectures, trained with stochastic gradient descent (SGD) optimization, to classify 12 genetic
diseases. The combination produced an accuracy of 97.66%. Nevertheless, Singh and Kisku [36]
acknowledge that the quality of the data used was poor and that the model does not classify Marfan
syndrome and 22q11 well. The overall accuracy is still reasonably good [36].
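
To show what SGD optimization does, the sketch below fits a one-variable linear model by updating the parameters after every sample. The toy data, learning rate, and epoch count are hypothetical, and the model is far simpler than the CNNs in [36]; only the update rule (parameter minus learning rate times gradient) is the same idea.

```python
import random


def sgd_fit(samples, lr=0.05, epochs=1000, seed=0):
    """Fit y ~ w * x + b with per-sample SGD on the squared error.

    lr, epochs and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)                 # visit the samples in random order
        for x, y in data:
            error = (w * x + b) - y       # derivative of 0.5 * error**2 w.r.t. prediction
            w -= lr * error * x           # gradient step for the weight
            b -= lr * error               # gradient step for the bias
    return w, b
```

Called on points that lie on the line y = 2x + 1, such as (0, 1), (1, 3), (2, 5), and (3, 7), the fit approaches w = 2 and b = 1.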

Meanwhile, Gurovich et al. in the following year used a deep CNN to tackle the more complex problem
of 200 genetic diseases [37]. Despite the difficulty of the task, they achieved an accuracy of 91% and
outperformed physicians in three initial trials [37]. Likewise, Qin et al. in 2020 used a deep CNN to
identify people with Down syndrome with an accuracy of 95.87% and a specificity of 97.40%, which is
helpful for early screening and for preventing disease progression [38].

In 2021, a large number of facial recognition studies used CNNs, which by then were popular for all
kinds of medical images. Hong et al. used the CNN architecture Visual Geometry Group-16 (VGG-16) to
detect genetic syndromes. The results, evaluated with five-fold cross-validation, showed an accuracy of
88.6% and a specificity of 91.24% [20]. In the same year, Liu et al. [7] compared several CNN
architectures to find the best one for identifying Williams-Beuren syndrome: VGG-16, VGG-19, ResNet 18,
ResNet 34, MobileNet-V2, and ImageNet. On the 340 samples obtained, the best accuracy, 92.7%, came from
VGG-19, suggesting that it is the most suitable of the architectures tested for this disease [7].
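
The five-fold cross-validation used by Hong et al. can be sketched as an index split: the data are divided into five folds, and each fold serves once as the test set while the remaining four are used for training. The helper name, the shuffling, and the seed are assumptions for illustration; how [20] actually partitioned its data is not described in this review.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

For a dataset of 456 samples, each of the five splits trains on roughly 365 samples and tests on the remaining 91 or 92; the reported metric is then the average over the five test folds.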

In addition, among the most recent studies, the first is by Yang et al. in 2021, who detected Noonan
syndrome with a deep CNN assisted by the Additive Angular Margin loss (ArcFace), using a total of 430
samples obtained from Guangdong Provincial People's Hospital [39]. The model outperformed six doctors
(used as the reference) in accuracy, sensitivity, and specificity, achieving an accuracy of 92.01% and
97.97%, and showed an excellent representation ability in predicting the output [39]. Second, Geremek
and Szklanny used 944 samples from the facial image collection UTKFace to classify 15 genetic diseases
in children aged 5-12 years. The study reached an accuracy of 84%, and the model can detect
abnormalities without requiring information about specific abnormalities: the system does not have to be
trained on all genetic diseases to detect genetic features in faces [40]. Table 1 summarizes all the
studies described.
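
The additive angular margin idea behind ArcFace can be sketched as follows: embeddings and class weights are normalised, so each logit is the cosine of the angle between them, and a margin is added to the angle of the true class before scaling. The margin of 0.5 and scale of 64 are commonly used values taken here as assumptions; the settings used in [39] are not given in this review, and production implementations add extra handling for angles near pi that is omitted here.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_logits(embedding, class_weights, target, margin=0.5, scale=64.0):
    """Logits with an additive angular margin on the target class.

    For class j the logit is scale * cos(theta_j); for the target class it is
    scale * cos(theta_target + margin), which makes the true class harder to
    satisfy during training and pushes classes apart on the unit sphere.
    """
    unit_embedding = l2_normalize(embedding)
    logits = []
    for j, weights in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(unit_embedding, l2_normalize(weights)))
        cos_theta = max(-1.0, min(1.0, cos_theta))    # guard against rounding
        if j == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits
```

These logits are then fed to the usual softmax cross-entropy loss; at inference time only the normalised embeddings are compared.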


Table 1. Summary and comparison of facial recognition methods in identification of genetic diseases 


Authors and Year | Genetic Disease | Algorithm | Dataset | Results
--- | --- | --- | --- | ---
Saraydemir et al. (2012) [19] | Down Syndrome | Gabor Wavelet Transform and k-NN-support vector machine (SVM) | The dataset was taken from universities in Turkey and the Down syndrome association in Turkey. | Classification using k-NN and SVM resulted in 96% and 97.34% accuracy, respectively.
Zhao et al. (2014) [29] | Down Syndrome | Local binary patterns (LBP), Gabor wavelet transform, and SVM. | 130 data from various ethnicities, 50 Down Syndrome; 80 healthy. | The highest accuracy was obtained at 0.967 with an F1 value of 0.956 with combined geometric and Gabor Jet features. The LBP accuracy is 0.970.
Kosilek et al. (2013) [30] | Cushing's Syndrome | Classification by comparing the texture (Gabor Jets) and geometry of the image. | 20 female endocrine outpatients at Mannheim University Hospital | The accuracy of the software that can perform the classification is 91.7%.
Basel-Vanagaite et al. (2016) [31] | Cornelia de Lange Syndrome | Local Binary Patterns and Bayesian Networks | Experiment 1: 31 data; Experiment 2: 17 data | The detection rate of the system is 87%.
Hadj-Rabia et al. (2017) [32] | X-Linked Hypohidrotic Ectodermal dysplasia (XLHED) | Facial Dysmorphology Novel Analysis | 27 frontal data | The sensitivity of all data obtained is 75%, and the specificity is 99%.
Liehr et al. (2017) [33] | Emanuel Syndrome and Pallister-Killian Syndrome | FDNA Technology | 2,173 data (Healthy, PKS, ES) | The average accuracy is 89.6%, and the best accuracy is 92.8%.
Singh and Kisku (2018) [36] | 12 syndromes | VGGFace and ResNet 50 with SGD optimizer | 1567 images | Produces an accuracy of 97.66%.
Gurovich et al. (2019) [37] | 200 syndromes | Cascaded DCNN (DeepGestalt) | 17,000 data | This technology only identifies a few disease phenotypes. DeepGestalt outperformed doctors in three initial trials with 91% accuracy.
Qin et al. (2020) [38] | Down Syndrome | Deep CNN | 10,562 (training data), 405 (test data) | The accuracy is 95.87%, and the specificity is 97.40% when identifying Down syndrome.
Hong et al. (2021) [20] | Genetic syndrome | CNN (VGG-16) and evaluated by five-fold cross-validation | 456 data from Guangdong Provincial People's Hospital | The accuracy is 0.8860, the specificity is 0.9124, and the F1-Score is 0.8829.
Liu et al. (2021) [7] | Williams-Beuren Syndrome | CNN (VGG-16, VGG-19, ResNet-18, ResNet-34, MobileNet-V2 and ImageNet). | 340 data (Guangdong Provincial People's Hospital) | By using VGG-19, the best accuracy is 92.7%, where this architecture is the most suitable when diagnosing Williams-Beuren Syndrome.
Yang et al. (2021) [39] | Noonan Syndrome | Deep CNN and Additive Angular Margin (DCNN-Arcface model) | 430 data from Guangdong Provincial People's Hospital | Achieved accuracy of 0.9201 and 0.9797.
Geremek and Szklanny (2021) [40] | 15 genetic diseases | Multi-task Cascaded Convolutional Neural Network (P-Net, R-Net, and O-Net) | 944 data, ranging from 5-12 years old from UTKFace. | The classification with the best accuracy was obtained at 84%.


4. CONCLUSION 


The studies covered in this review, published over the last ten years, demonstrate facial recognition
for genetic diseases. The review shows that facial recognition for genetic diseases performs well and
can recognize increasingly complex genetic diseases in recent studies. Various methods, such as the GWT
approach, texture analysis, and deep learning, have been developed, and several studies report that
they can simplify the work of doctors and reduce diagnosis time, which would take weeks without the
help of artificial intelligence. This still leaves opportunities for researchers to continue
developing data, methods, and algorithms for photo and video facial recognition, possibly in
combination with other biometrics.